THE NUMBER OF BIT COMPARISONS USED BY QUICKSORT: 

Abstract. The analyses of many algorithms and data structures (such 
as digital search trees) for searching and sorting are based on the rep- 
resentation of the keys involved as bit strings and so count the number 
of bit comparisons. On the other hand, the standard analyses of many 
other algorithms (such as Quicksort) are performed in terms of the num- 
ber of key comparisons. We introduce the prospect of a fair comparison 
between algorithms of the two types by providing an average-case anal- 
ysis of the number of bit comparisons required by Quicksort. Counting 
bit comparisons rather than key comparisons introduces an extra loga- 
rithmic factor to the asymptotic average total. We also provide a new 
algorithm, "BitsQuick", that reduces this factor to constant order by 
eliminating needless bit comparisons. 



1. Introduction and summary 

Algorithms for sorting and searching (together with their accompanying analy- 
ses) generally fall into one of two categories: either the algorithm is regarded as 
comparing items pairwise irrespective of their internal structure (and so the anal- 
ysis focuses on the number of comparisons), or else it is recognized that the items 
(typically numbers) are represented as bit strings and that the algorithm operates 
on the individual bits. Typical examples of the two types are Quicksort and digital 
search trees, respectively; see [IS] . 

In this paper — a substantial expansion of the extended abstract [7] — we take a 
first step towards bridging the gap between the two points of view, in order to 
facilitate run-time comparisons across the gap, by answering the following question 
posed many years ago by Bob Sedgewick [personal communication]: What is the 
bit complexity of Quicksort? (For a discussion of related work that has transpired
in the time between [7] and this paper, see Remark 1.6 at the end of this section.)

More precisely, we consider Quicksort (see Section [2] for a review) applied 
to n distinct keys (numbers) from the interval (0, 1). Many authors (Knuth [15],
Régnier [T!5], Rösler [3T], Knessl and Szpankowski [13], Fill and Janson [5] [BJ,
Neininger and Rüschendorf [T8], and others) have studied $K_n$, the (random) number
of key comparisons performed by the algorithm. This is a natural measure of
the cost (run-time) of the algorithm, if each comparison has the same cost. On 
the other hand, if comparisons are done by scanning the bit representations of the 
numbers, comparing their bits one by one, then the cost of comparing two keys is 
determined by the number of bits compared until a difference is found. We call this 
number the number of bit comparisons for the key comparison, and let $B_n$ denote
the total number of bit comparisons when n keys are sorted by Quicksort. 

We assume that the keys $X_1, \dots, X_n$ to be sorted are independent random variables
with a common continuous distribution F over (0, 1). It is well known that
the distribution of the number $K_n$ of key comparisons does not depend on F. This
invariance clearly fails to extend to the number $B_n$ of bit comparisons, and so we
need to specify F.

For simplicity, we study mainly the case that F is the uniform distribution, and, 
throughout, the reader should assume this as the default. But we also give a result 
valid for a general absolutely continuous distribution F over (0, 1) (subject to a 
mild integrability condition on the density). 

In this paper we focus on the mean of $B_n$. One of our main results is the following
Theorem 1.1, the concise version of which is the asymptotic equivalence

\[ \mathbf{E}\,B_n \sim n(\ln n)(\lg n) \quad \text{as } n \to \infty. \]

Throughout, we use ln (respectively, lg) to denote natural (resp., binary) logarithm,
and use log when the base doesn't matter (for example, in remainder estimates).
The symbol $\doteq$ is used to denote approximate equality, and $\gamma \doteq 0.57722$ is Euler's
constant.



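Theorem 1.1. If the keys $X_1, \dots, X_n$ are independent and uniformly distributed
on (0, 1), then the number $B_n$ of bit comparisons required to sort these keys using
Quicksort has expectation given by the following exact and asymptotic expressions:

\begin{align*}
\mathbf{E}\,B_n &= 2 \sum_{k=2}^{n} (-1)^k \binom{n}{k} \frac{1}{(k-1)k[1 - 2^{-(k-1)}]} \tag{1.1} \\
&= n(\ln n)(\lg n) - c_1 n \ln n + c_2 n + \pi_n n + O(\log n), \tag{1.2}
\end{align*}

where, with $\beta := 2\pi/\ln 2$,

\begin{align*}
c_1 &:= \frac{1}{\ln 2}(4 - 2\gamma - \ln 2) \doteq 3.105, \\
c_2 &:= \frac{1}{\ln 2}\left[\frac{1}{6}(6 - \ln 2)^2 - (4 - \ln 2)\gamma + \frac{\pi^2}{6} + \gamma^2\right] \doteq 6.872,
\end{align*}

and

\[ \pi_n := \sum_{k \in \mathbb{Z}:\, k \neq 0} \frac{i}{\pi k(-1 - i\beta k)}\, \Gamma(-1 - i\beta k)\, n^{i\beta k} \tag{1.3} \]

is periodic in $\lg n$ with period 1 and amplitude smaller than $5 \times 10^{-9}$.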



Small periodic fluctuations as in Theorem 1.1 come as a surprise to newcomers
to the analysis of algorithms but in fact are quite common in the analysis of digital
structures and algorithms; see, for example, Chapter 6 in [16].

For our further results, it is technically convenient to assume that the number
of keys is no longer fixed at n, but rather Poisson distributed with mean $\lambda$ and
independent of the values of the keys. (In this paper, we shall not deal with the
"de-Poissonization" that would be needed to transfer results back to the fixed-n
model.) In obvious notation, the Poissonized version of (1.1)-(1.2) is

\begin{align*}
\mathbf{E}\,B(\lambda) &= 2 \sum_{k=2}^{\infty} (-1)^k \frac{\lambda^k}{k!} \frac{1}{(k-1)k[1 - 2^{-(k-1)}]} \tag{1.4} \\
&= \lambda(\ln \lambda)(\lg \lambda) - c_1 \lambda \ln \lambda + c_2 \lambda + \pi_\lambda \lambda + O(\log \lambda) \quad \text{as } \lambda \to \infty, \tag{1.5}
\end{align*}

with $\pi_\lambda$ as in (1.3). The exact formula follows immediately from (1.1), and the
asymptotic formula is established in Section 5 as Proposition 5.4. We will also see
(Proposition 5.6) that $\operatorname{Var} B(\lambda) = O(\lambda^2)$, so $B(\lambda)$ is concentrated about its mean.
Since the number $K(\lambda)$ of key comparisons is likewise concentrated about its mean
$\mathbf{E}\,K(\lambda) \sim 2\lambda \ln \lambda$ for large $\lambda$ (see Lemmas 5.1 and 5.3), it follows that

\[ \frac{2}{\lg \lambda} \times \frac{B(\lambda)}{K(\lambda)} \to 1 \quad \text{in probability as } \lambda \to \infty. \tag{1.6} \]

In other words, about $\frac{1}{2} \lg \lambda$ bits are compared per key comparison.

Remark 1.2. Further terms can be obtained in (1.2) and (1.5) by the methods
used in the proofs below. In particular, the $O(\log \lambda)$ in (1.5) can be refined to

\[ -2 \ln \lambda - c_4 + O(\lambda^{-M}) \]

for any fixed M, with

\[ c_4 := 4 \ln 2 + 2 + 2\gamma \doteq 5.927. \]

For non-uniform distribution F, we have the same leading term for the asymptotic
expansion of $\mathbf{E}\,B(\lambda)$, but the second-order term is larger. (Throughout, $\ln_+$
denotes the positive part of the natural logarithm function. We denote the uniform
distribution by unif.)

Theorem 1.3. Let $X_1, X_2, \dots$ be independent with a common distribution F over
(0, 1) having density f, and let N be independent and Poisson with mean $\lambda$. If
$\int f (\ln_+ f)^4 < \infty$, then the expected number of bit comparisons, call it $\mu_f(\lambda)$,
required to sort the keys $X_1, \dots, X_N$ using Quicksort satisfies

\[ \mu_f(\lambda) = \mu_{\mathrm{unif}}(\lambda) + 2H(f)\,\lambda \ln \lambda + o(\lambda \log \lambda) \]

as $\lambda \to \infty$, where $H(f) := \int_0^1 f \lg f \ge 0$ is the entropy (in bits) of the density f.

In applications, it may be unrealistic to assume that a specific density f is known.
Nevertheless, even in such cases, Theorem 1.3 may be useful since it provides a
measure of the robustness of the asymptotic estimate in Theorem 1.1.

Bob Sedgewick (among others who heard us speak on the material of this paper) 
suggested that the number of bit comparisons for Quicksort might be reduced 
substantially by not comparing bits that have to be equal according to the results 
of earlier steps in the algorithm. In the final section (Theorem 7.1), we note that
this is indeed the case: For a fixed number n of keys, the average number of bit
comparisons in the improved algorithm (which we dub "BitsQuick") is asymptotically
equivalent to $2\left(1 + \frac{3}{2 \ln 2}\right) n \ln n$, only a constant ($\doteq 3.2$) times the average
number of key comparisons [see (2.2)]. A related algorithm is the digital version
of Quicksort by Roura [22]; it too requires $\Theta(n \log n)$ bit comparisons (we do not
know the exact constant factor).

We may compare our results to those obtained for radix-based methods, for example
radix exchange sorting, see [15, Section 5.2.2]. This method works by bit
inspections, that is, by comparisons to constant bits, rather than by pairwise
comparisons. In the case of n uniformly distributed keys, radix exchange sorting uses
asymptotically $n \lg n$ bit inspections. Since radix exchange sorting is designed so
that the number of bit inspections is minimal, it is not surprising that our results
show that Quicksort uses more bit comparisons. More precisely, Theorem 1.1
shows that Quicksort uses about $\ln n$ times as many bit comparisons as radix
exchange sorting. For BitsQuick, this is reduced to a small constant factor. This gives
us a measure of the cost in bit comparisons of using these algorithms; Quicksort 
is often used because of other advantages, and our results open the possibility of 
seeing when they outweigh the increase in bit comparisons. 

In Section 2 we review Quicksort itself and basic facts about the number $K_n$
of key comparisons. In Section 3 we derive the exact formula (1.1) for $\mathbf{E}\,B_n$, and
in Section 4 we derive the asymptotic expansion (1.2) from an alternative exact
formula that is somewhat less elementary than (1.1) but much more transparent
for asymptotics. In the transitional Section 5 we establish certain basic facts about
the moments of $K(\lambda)$ and $B(\lambda)$ in the Poisson case with uniformly distributed keys,
and in Section 6 we use martingale arguments to establish Theorem 1.3 for the
expected number of bit comparisons for Poisson($\lambda$) draws from a general density f.
Finally, in Section 7 we study the improved BitsQuick algorithm discussed in the
preceding paragraph.

Remark 1.4. The results can be generalized to bases other than 2. For example, 
base 256 would give corresponding results on the "byte complexity" . 

Remark 1.5. Cutting off and sorting small subfiles differently would affect the
results in Theorems 1.1 and 1.3 by $O(n \log n)$ and $O(\lambda \log \lambda)$ only. In particular,
the leading terms would remain the same.

Remark 1.6. In comparison with the extended abstract [7], new in this expanded
treatment are Remark 5.2, Propositions 5.4 and 5.7, and Lemma 6.2, together with
complete proofs of Theorem 1.3, Lemmas 5.1 and 5.3, and Remark 6.3. Section 7
has been substantially revised.

In the time between [7] and the present paper, the following developments have 
occurred: 

• Fill and Nakama [8] followed the same sort of approach as in this paper 
to obtain certain exact and asymptotic expressions for the number of bit 
comparisons required by Quickselect, a close cousin of Quicksort. 

• Vallée et al. [23] used analytic-combinatorial methods to extend the results 
of [7] and [8] by deriving asymptotic expressions for the expected number 
of symbol comparisons for both Quicksort and Quickselect. In their 
work, as in the present paper, the keys are assumed to be independent and 
identically distributed, but the authors allow for quite general probabilistic 
models (also known as "sources" ) for how each key is generated as a symbol 
string. 

• Fill and Nakama [5] (see also [T7]) obtained, for quite general sources, a 
limiting distribution for the (suitably scale-normalized) number of symbol 
comparisons required by Quickselect. 

• Fill [4] obtained, for quite general sources, a limiting distribution for the
(suitably center-and-scale-normalized) number of symbol comparisons
required by Quicksort.

We were motivated to expand [7] to the present full-length paper in large part
because this paper's Lemmas 5.1 and 5.3 and an extension of (the proof of)
Proposition 5.7 play key roles in [4].

2. Review: number of key comparisons used by Quicksort 

In this section we briefly review certain basic known results concerning the num- 
ber K n of key comparisons required by Quicksort for a fixed number n of keys 
uniformly distributed on (0, 1). (See, for example, [6] and the references therein for 
further details.) 

Quicksort, invented by Hoare [13], is the standard sorting procedure in Unix 
systems, and has been cited [3] as one of the ten algorithms "with the greatest 
influence on the development and practice of science and engineering in the 20th 
century." The Quicksort algorithm for sorting an array of n distinct keys is very 
simple to describe. If n = 0 or n = 1, there is nothing to do. If n ≥ 2, pick a key 
uniformly at random from the given array and call it the "pivot". Compare the 
other keys to the pivot to partition the remaining keys into two subarrays. Then 
recursively invoke Quicksort on each of the two subarrays. 
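
The procedure just described, together with the two cost measures studied in this paper, can be sketched in a few lines of Python. This is only an illustrative simulation under stated assumptions: keys are modeled as distinct 64-bit random integers (a truncation of uniform keys on (0, 1)), and the names `first_diff_bit`, `quicksort_costs`, and `average_costs` are hypothetical helpers, not part of the paper.

```python
import random

BITS = 64  # assumption: each key is truncated to its first 64 random bits


def first_diff_bit(x: int, y: int, bits: int = BITS) -> int:
    """1-based index, counted from the most significant bit, of the first
    bit at which the distinct bit strings x and y differ."""
    assert x != y
    return bits - (x ^ y).bit_length() + 1


def quicksort_costs(keys, rng):
    """Run Quicksort with a uniformly random pivot on distinct keys and
    return (number of key comparisons, number of bit comparisons)."""
    key_cmp = 0
    bit_cmp = 0
    stack = [list(keys)]
    while stack:
        arr = stack.pop()
        if len(arr) < 2:
            continue  # n = 0 or n = 1: nothing to do
        pivot = arr[rng.randrange(len(arr))]
        smaller, larger = [], []
        for key in arr:
            if key == pivot:
                continue
            key_cmp += 1
            # Comparing two keys bit by bit costs the index of the
            # first differing bit.
            bit_cmp += first_diff_bit(key, pivot)
            (smaller if key < pivot else larger).append(key)
        stack.append(smaller)
        stack.append(larger)
    return key_cmp, bit_cmp


def average_costs(n, trials, seed=0):
    """Monte Carlo estimates of (E K_n, E B_n)."""
    rng = random.Random(seed)
    total_k = 0
    total_b = 0
    for _ in range(trials):
        keys = set()
        while len(keys) < n:
            keys.add(rng.getrandbits(BITS))
        k, b = quicksort_costs(list(keys), rng)
        total_k += k
        total_b += b
    return total_k / trials, total_b / trials
```

For n = 2 there is exactly one key comparison, and the estimated number of bit comparisons should be close to 2, the mean index of the first differing bit of two independent uniform bit strings.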

With $K_0 := 0$ as initial condition, $K_n$ satisfies the distributional recurrence
relation

\[ K_n \stackrel{\mathcal{L}}{=} K_{U_n - 1} + K^*_{n - U_n} + n - 1, \qquad n \ge 1, \]

where $\stackrel{\mathcal{L}}{=}$ denotes equality in law (i.e., in distribution), and where, on the right, $U_n$
is distributed uniformly over the set $\{1, \dots, n\}$, $K^*_j \stackrel{\mathcal{L}}{=} K_j$ for every j, and

\[ U_n;\ K_0, \dots, K_{n-1};\ K^*_0, \dots, K^*_{n-1} \]

are all independent.

Passing to expectations we obtain the "divide-and-conquer" recurrence relation

\[ \mathbf{E}\,K_n = \frac{2}{n} \sum_{j=0}^{n-1} \mathbf{E}\,K_j + n - 1, \]

which is easily solved to give

\begin{align*}
\mathbf{E}\,K_n &= 2(n+1)H_n - 4n \tag{2.1} \\
&= 2n \ln n - (4 - 2\gamma)n + 2 \ln n + (2\gamma + 1) + O(1/n). \tag{2.2}
\end{align*}
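
As a quick sanity check, the solution (2.1) of the divide-and-conquer recurrence can be confirmed in exact rational arithmetic. The sketch below is illustrative only; the function names are hypothetical.

```python
from fractions import Fraction


def mean_key_comparisons(n_max):
    """E K_n for n = 0..n_max, computed from the recurrence
    E K_n = (2/n) * sum_{j<n} E K_j + n - 1 with E K_0 = 0."""
    ek = [Fraction(0)]
    running = Fraction(0)  # holds sum of E K_j for j < n
    for n in range(1, n_max + 1):
        running += ek[n - 1]
        ek.append(Fraction(2, n) * running + (n - 1))
    return ek


def closed_form(n):
    """Right-hand side of (2.1): 2(n+1)H_n - 4n."""
    harmonic = sum(Fraction(1, j) for j in range(1, n + 1))
    return 2 * (n + 1) * harmonic - 4 * n
```

For example, both expressions give E K_2 = 1 and E K_3 = 8/3.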

It is also routine to use a recurrence to compute explicitly the exact variance of $K_n$.
In particular, the asymptotics are

\[ \operatorname{Var} K_n = \sigma^2 n^2 - 2n \ln n + O(n) \]

where $\sigma^2 := 7 - \frac{2}{3}\pi^2 \doteq 0.4203$. Higher moments can be handled similarly. Further,
the normalized sequence

\[ \widehat{K}_n := (K_n - \mathbf{E}\,K_n)/n, \qquad n \ge 1, \]

converges in distribution, with convergence of moments of each order, to $\widehat{K}$, where
the law of $\widehat{K}$ is characterized as the unique distribution over the real line with
vanishing mean that satisfies a certain distributional identity; and the moment
generating functions of $\widehat{K}_n$ converge pointwise to that of $\widehat{K}$.

3. Exact mean number of bit comparisons

In this section we establish the exact formula (1.1), repeated here for convenience
as (3.1), for the expected number of bit comparisons required by Quicksort for a
fixed number n of keys uniformly distributed on (0, 1):

\[ \mathbf{E}\,B_n = 2 \sum_{k=2}^{n} (-1)^k \binom{n}{k} \frac{1}{(k-1)k[1 - 2^{-(k-1)}]}. \tag{3.1} \]



Let $X_1, \dots, X_n$ denote the keys, and $X_{(1)} < \cdots < X_{(n)}$ their order statistics.
Consider ranks $1 \le i < j \le n$. Formula (3.1) follows readily from the following
three facts, all either obvious or very well known:

• The event $C_{ij} := \{\text{keys } X_{(i)} \text{ and } X_{(j)} \text{ are compared}\}$ and the random
vector $(X_{(i)}, X_{(j)})$ are independent.

• $\mathbf{P}(C_{ij}) = 2/(j - i + 1)$. [Indeed, $C_{ij}$ equals the event that the first pivot
chosen from among $X_{(i)}, \dots, X_{(j)}$ is either $X_{(i)}$ or $X_{(j)}$.]

• The joint density $g_{n,i,j}$ of $(X_{(i)}, X_{(j)})$ is given by

\[ g_{n,i,j}(x, y) = \binom{n}{i-1,\ 1,\ j-i-1,\ 1,\ n-j}\, x^{i-1} (y - x)^{j-i-1} (1 - y)^{n-j}. \tag{3.2} \]

Let $b(x, y)$ denote the index of the first bit at which the numbers $x, y \in (0, 1)$ differ.
(For definiteness we take in this paper the terminating expansion with infinitely
many zeros for dyadic rationals in [0, 1), but 1 = .111....) Then

\begin{align*}
\mathbf{E}\,B_n &= \sum_{1 \le i < j \le n} \mathbf{P}(C_{ij}) \int_0^1 \int_x^1 b(x, y)\, g_{n,i,j}(x, y)\, dy\, dx \tag{3.3} \\
&= \int_0^1 \int_x^1 b(x, y)\, p_n(x, y)\, dy\, dx,
\end{align*}

where $p_n(x, y)$ has the definition and interpretation

\[ p_n(x, y) := \sum_{1 \le i < j \le n} \mathbf{P}(C_{ij})\, g_{n,i,j}(x, y) = \frac{\mathbf{P}(\text{keys in } (x, x + dx) \text{ and } (y, y + dy) \text{ are compared})}{dx\, dy}. \]

By a routine calculation,

\begin{align*}
p_n(x, y) &= \frac{2}{(y - x)^2} \left[ (1 - (y - x))^n - 1 + n(y - x) \right] \tag{3.4} \\
&= 2 \sum_{k=2}^{n} (-1)^k \binom{n}{k} (y - x)^{k-2},
\end{align*}

which depends on x and y only through the difference y - x. Plugging (3.4)
into (3.3), we find

\[ \mathbf{E}\,B_n = 2 \sum_{k=2}^{n} (-1)^k \binom{n}{k} \int_0^1 \int_x^1 b(x, y) (y - x)^{k-2}\, dy\, dx. \]
But, by routine (if somewhat lengthy) calculation,

\begin{align*}
\int_0^1 \int_x^1 b(x, y) (y - x)^{k-2}\, dy\, dx
&= \sum_{\ell=0}^{\infty} (\ell + 1) \iint_{0 < x < y < 1:\ b(x,y) = \ell + 1} (y - x)^{k-2}\, dx\, dy \\
&= \sum_{\ell=0}^{\infty} (\ell + 1)\, 2^{\ell} \int_0^{2^{-(\ell+1)}} \int_{2^{-(\ell+1)}}^{2^{-\ell}} (y - x)^{k-2}\, dy\, dx \\
&= \frac{1}{(k-1)k[1 - 2^{-(k-1)}]}.
\end{align*}

This now leads immediately to the desired (3.1).
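
Formula (3.1) is straightforward to evaluate exactly with rational arithmetic, which also makes the size of the individual alternating terms visible. The following is a minimal sketch; `mean_bit_comparisons` and `largest_term` are hypothetical helper names introduced here for illustration.

```python
from fractions import Fraction
from math import comb


def _term(n, k):
    """k-th signed term of (3.1), including the leading factor 2."""
    denom = (k - 1) * k * (1 - Fraction(1, 2 ** (k - 1)))
    return 2 * (-1) ** k * comb(n, k) / denom


def mean_bit_comparisons(n):
    """Exact value of E B_n from formula (3.1), as a Fraction."""
    return sum((_term(n, k) for k in range(2, n + 1)), Fraction(0))


def largest_term(n):
    """Largest absolute value among the terms of (3.1)."""
    return max(abs(_term(n, k)) for k in range(2, n + 1))
```

For n = 2 the sum has a single term and gives E B_2 = 2. For larger n the terms are many orders of magnitude larger than the sum itself, so a floating-point evaluation of the alternating sum would be unreliable; exact fractions avoid that problem at the price of big-integer arithmetic.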



4. Asymptotic mean number of bit comparisons 

Formula (1.1), repeated at (3.1), is hardly suitable for numerical calculations or
asymptotic treatment, due to excessive cancellations in the alternating sum. Indeed,
if (say) n = 100, then the terms (including the factor 2, for definiteness) alternate
in sign, with magnitude as large as $10^{25}$, and yet $\mathbf{E}\,B_n \doteq 2295$. Fortunately,
there is a standard complex-analytic technique designed for precisely our situation
(alternating binomial sums), namely, Rice's method. We will not review the idea
behind the method here, but rather refer the reader to (for example) Section 6.4
of [H5]. Let

\[ h(z) := \frac{2}{(z - 1)z[1 - 2^{-(z-1)}]} \]

and let $B(z, w) := \Gamma(z)\Gamma(w)/\Gamma(z + w)$ denote the (meromorphic continuation of the)
classical beta function. According to Rice's method, $\mathbf{E}\,B_n$ equals the sum of
the residues of the function $B(n + 1, -z)h(z)$ at

• the triple pole at z = 1;

• the simple poles at $z = 1 + i\beta k$, for $k \in \mathbb{Z} \setminus \{0\}$;

• the double pole at z = 0.

The residues are easily calculated, especially with the aid of such symbolic-manipulation
software as Mathematica or Maple. Corresponding to the above list, the
residues equal

• $\frac{n}{\ln 2}\left[H_{n-1}^2 - (4 - \ln 2)H_{n-1} + \frac{1}{6}(6 - \ln 2)^2 + H_{n-1}^{(2)}\right]$;

• $\frac{i}{\pi k(-1 - i\beta k)} \cdot \frac{\Gamma(-1 - i\beta k)\,\Gamma(n + 1)}{\Gamma(n - i\beta k)}$;

• $-2(H_n + 2 \ln 2 + 1)$,

where $H_n^{(r)} := \sum_{j=1}^{n} j^{-r}$ denotes the nth harmonic number of order r and $H_n := H_n^{(1)}$.
Summing the residue contributions gives an alternative exact formula for
$\mathbf{E}\,B_n$, from which the asymptotic expansion (1.2) (as well as higher-order terms)
can be read off easily using standard asymptotics for $H_n^{(r)}$ and Stirling's formula;
we omit the details.

This completes the proof of Theorem 1.1.
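
The residue computation can be checked numerically. The sketch below compares the exact alternating sum (1.1), evaluated with fractions, against the sum of the residues at z = 1 and z = 0 only; the contribution of the simple poles at $z = 1 + i\beta k$ is deliberately omitted, so agreement is expected only up to that small oscillating term. Function names are hypothetical.

```python
from fractions import Fraction
from math import comb, log


def exact_mean(n):
    """E B_n from the alternating sum (1.1), evaluated exactly."""
    total = Fraction(0)
    for k in range(2, n + 1):
        denom = (k - 1) * k * (1 - Fraction(1, 2 ** (k - 1)))
        total += (-1) ** k * comb(n, k) / denom
    return float(2 * total)


def residue_formula(n):
    """Residues at the triple pole z = 1 and the double pole z = 0.
    The simple poles at z = 1 + i*beta*k are NOT included."""
    h_prev = sum(1.0 / j for j in range(1, n))        # H_{n-1}
    h2_prev = sum(1.0 / j ** 2 for j in range(1, n))  # H_{n-1}^{(2)}
    h_n = h_prev + 1.0 / n                            # H_n
    ln2 = log(2.0)
    triple = (n / ln2) * (
        h_prev ** 2 - (4 - ln2) * h_prev + (6 - ln2) ** 2 / 6 + h2_prev
    )
    double = -2.0 * (h_n + 2 * ln2 + 1)
    return triple + double
```

The two functions agree closely for every n tried here, consistent with the oscillating contribution being numerically tiny; unlike the alternating sum, the residue form involves no large cancellations.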

Remark 4.1. We can calculate $\mathbf{E}\,K_n$ in the same fashion (and somewhat more
easily), by replacing the bit-index function b by the constant function 1. Following
this approach, we obtain first the following analogue of (3.1):

\[ \mathbf{E}\,K_n = 2 \sum_{k=2}^{n} (-1)^k \binom{n}{k} \frac{1}{(k-1)k}. \]

Then the residue contributions using Rice's method are

• $2n\left(H_n - 2 - \frac{1}{n}\right)$, at the double pole at z = 1;

• $2(H_n + 1)$, at the double pole at z = 0.

Summing the two contributions gives an alternative derivation of (2.1).



5. Poissonized model for uniform draws

As a warm-up for Section 6, we now suppose that the number of keys (throughout
this section still assumed to be uniformly distributed) is Poisson with mean $\lambda$.



5.1. Key comparisons. We begin with a lemma which provides both the analogue
of (2.1)-(2.2) and two other facts we will need in Section 6.

Lemma 5.1. In the setting of Theorem 1.3 with F uniform, the expected number
of key comparisons is a strictly convex function of $\lambda$ given by

\[ \mathbf{E}\,K(\lambda) = 2 \int_0^{\lambda} (\lambda - y)(e^{-y} - 1 + y)\, y^{-2}\, dy. \]

Asymptotically, as $\lambda \to \infty$ we have

\[ \mathbf{E}\,K(\lambda) = 2\lambda \ln \lambda - (4 - 2\gamma)\lambda + 2 \ln \lambda + 2\gamma + 2 + O(e^{-\lambda} \lambda^{-2}) \tag{5.1} \]

and as $\lambda \to 0$ we have

\[ \mathbf{E}\,K(\lambda) = \tfrac{1}{2}\lambda^2 + O(\lambda^3). \tag{5.2} \]

Comparing the $n \to \infty$ expansion (2.2) with the corresponding expansion for
Poisson($\lambda$) many keys, note the difference in constant terms and the much smaller
error term in the Poisson case.
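
Both expansions in Lemma 5.1 can be checked numerically by averaging the fixed-n formula (2.1) over a Poisson number of keys. The following sketch assumes a simple truncation of the Poisson sum (the cutoff rule is an ad hoc choice, not from the paper), and its function names are hypothetical.

```python
from math import exp, log

EULER_GAMMA = 0.5772156649015329


def mean_key_comparisons_poisson(lam, n_max=None):
    """E K(lambda) = sum_n P(N = n) * E K_n with N ~ Poisson(lambda)
    and E K_n = 2(n+1)H_n - 4n, truncated at n_max."""
    if n_max is None:
        n_max = int(lam + 40 * lam ** 0.5 + 60)  # ad hoc cutoff
    total = 0.0
    pmf = exp(-lam)  # P(N = 0); the n = 0 term contributes nothing
    harmonic = 0.0
    for n in range(1, n_max + 1):
        pmf *= lam / n
        harmonic += 1.0 / n
        total += pmf * (2 * (n + 1) * harmonic - 4 * n)
    return total


def expansion_large_lambda(lam):
    """Right-hand side of (5.1) without the exponentially small error."""
    return (2 * lam * log(lam) - (4 - 2 * EULER_GAMMA) * lam
            + 2 * log(lam) + 2 * EULER_GAMMA + 2)
```

At $\lambda = 30$ the two quantities agree to within floating-point accuracy, reflecting the exponentially small remainder in (5.1), while for small $\lambda$ the Poisson average is close to $\lambda^2/2$ as in (5.2).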



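Proof. To obtain the exact formula, begin with

\[ \mathbf{E}\,K_n = \int_0^1 \int_x^1 p_n(x, y)\, dy\, dx; \]

cf. (3.3) and recall Remark 4.1. Then multiply both sides by $e^{-\lambda} \lambda^n / n!$ and sum,
using the middle expression in (3.4); we omit the simple computation. Strict
convexity then follows from the calculation $\frac{d^2}{d\lambda^2} \mathbf{E}\,K(\lambda) = 2(e^{-\lambda} - 1 + \lambda)/\lambda^2 > 0$, and
asymptotics as $\lambda \to 0$ are trivial: $\mathbf{E}\,K(\lambda) = 2 \int_0^{\lambda} (\lambda - y)[\tfrac{1}{2} + O(y)]\, dy = \tfrac{1}{2}\lambda^2 + O(\lambda^3)$.

To derive the result for $\lambda \to \infty$, letting $\mathbf{1}[A]$ denote 1 if A holds and 0 otherwise,
we observe

\begin{align*}
\tfrac{1}{2} \mathbf{E}\,K(\lambda)
&= \lambda \int_0^{\infty} (e^{-y} - 1 + y\,\mathbf{1}[y < 1])\, y^{-2}\, dy - \lambda \int_{\lambda}^{\infty} (e^{-y} - 1)\, y^{-2}\, dy + \lambda \int_1^{\lambda} y^{-1}\, dy \\
&\quad - \int_0^{\infty} (e^{-y} - \mathbf{1}[y < 1])\, y^{-1}\, dy + \int_{\lambda}^{\infty} e^{-y} y^{-1}\, dy + \int_1^{\lambda} y^{-1}\, dy - \int_0^{\lambda} dy \\
&= -\lambda(1 - \gamma) + \left[1 - \lambda \int_{\lambda}^{\infty} e^{-y} y^{-2}\, dy\right] + \lambda \ln \lambda + \gamma + \int_{\lambda}^{\infty} e^{-y} y^{-1}\, dy + \ln \lambda - \lambda \\
&= \lambda \ln \lambda - (2 - \gamma)\lambda + \ln \lambda + \gamma + 1 + O(e^{-\lambda} \lambda^{-2}),
\end{align*}

as desired. The calculations

\begin{align*}
\int_0^{\infty} (e^{-y} - \mathbf{1}[y < 1])\, y^{-1}\, dy &= -\gamma, \tag{5.3} \\
\int_0^{\infty} (e^{-y} - 1 + y\,\mathbf{1}[y < 1])\, y^{-2}\, dy &= -(1 - \gamma), \tag{5.4} \\
\int_{\lambda}^{\infty} e^{-y} y^{-1}\, dy &= e^{-\lambda} \lambda^{-1} + O(e^{-\lambda} \lambda^{-2}), \tag{5.5} \\
\int_{\lambda}^{\infty} e^{-y} y^{-2}\, dy &= e^{-\lambda} \lambda^{-2} + O(e^{-\lambda} \lambda^{-3}) \tag{5.6}
\end{align*}

used at the second and third equalities are justified in Appendix A. □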

Remark 5.2. The error term in (5.1) can, using Lemma A.2, be refined to an
asymptotic expansion. Indeed, for any $M \ge 1$ it can be written as

\[ 2 e^{-\lambda} \sum_{k=1}^{M-1} (-1)^{k+1}\, k \cdot k!\, \lambda^{-k-1} + O(e^{-\lambda} \lambda^{-M-1}). \]

To handle the number of bit comparisons, we will also need the following bounds
on the moments of $K(\lambda)$. Together with Lemma 5.1, these bounds also establish
concentration of $K(\lambda)$ about its mean when $\lambda$ is large. For real $1 \le p < \infty$, we
let $\|W\|_p := (\mathbf{E}|W|^p)^{1/p}$ denote $L^p$-norm and use $\mathbf{E}(W; A)$ as shorthand for the
expectation of the product of W and the indicator of the event A.

Lemma 5.3. For every real $p \ge 1$, there exists a constant $c_p < \infty$ such that

\begin{align*}
\|K(\lambda) - \mathbf{E}\,K(\lambda)\|_p &\le c_p \lambda \quad \text{for } \lambda \ge 1, \\
\|K(\lambda)\|_p &\le c_p \lambda^{2/p} \quad \text{for } \lambda \le 1.
\end{align*}

In particular, $\operatorname{Var} K(\lambda) \le c_2^2 \lambda^2$ for all $\lambda > 0$.

Proof. We use the notation of Theorem 1.3 with F uniform [so that $K(\lambda) = K_N$
with N distributed Poisson($\lambda$)] and write $\kappa_n := \mathbf{E}\,K_n$ for $n \ge 0$.

(a) The first result is certainly true for $\lambda \ge 1$ bounded away from $\infty$. For $\lambda \to \infty$
the result can be established by Poissonizing standard Quicksort moment
calculations, as we now sketch. (Although the following argument is valid for all $p \ge 1$,
the reader that so prefers may assume that p is an even integer.) We start with

\[ \|K(\lambda) - \mathbf{E}\,K(\lambda)\|_p \le \|K_N - \kappa_N\|_p + \|\kappa_N - \mathbf{E}\,K(\lambda)\|_p \tag{5.7} \]

and proceed to argue that the first term on the right is asymptotically linear in $\lambda$
while the second term is $o(\lambda)$.

To handle the first term, observe that

\[ \|K_N - \kappa_N\|_p^p = \mathbf{E}|K_N - \kappa_N|^p = \mathbf{E}\,\mathbf{E}[\,|K_N - \kappa_N|^p \mid N\,]. \]

But

\[ \mathbf{E}[\,|K_N - \kappa_N|^p \mid N = n\,] = \mathbf{E}|K_n - \kappa_n|^p; \]

by the comments at the very end of Section 2 this equals $(1 + o(1)) \left(\mathbf{E}|\widehat{K}|^p\right) n^p$ as
$n \to \infty$ and so can be bounded for all n by a constant times $n^p$. Thus one need
only observe that $\mathbf{E}\,N^p = (1 + o(1))\lambda^p$ as $\lambda \to \infty$ to complete treatment of the first
term on the right in (5.7).

To treat the second term in RHS(5.7) as $\lambda \to \infty$, one can show using (2.2)
and (5.1) and the normal approximation to the Poisson that

\[ \|\kappa_N - \mathbf{E}\,K(\lambda)\|_p = (1 + o(1))\, 2\, \|N \ln N - \lambda \ln \lambda\|_p = (1 + o(1))\, 2\, \|Z\|_p\, \lambda^{1/2} \ln \lambda = o(\lambda) \]

where Z has the standard normal distribution. We omit the details.

(b) For $\lambda \le 1$ we use

\[ \mathbf{E}\,K^p(\lambda) \le \mathbf{E}\left[\binom{N}{2}^p;\ N \ge 2\right] \le \mathbf{E}\left[N^{2p};\ N \ge 2\right] = \lambda^2 \sum_{n=2}^{\infty} e^{-\lambda} \frac{\lambda^{n-2}}{n!}\, n^{2p} \le c_p^p \lambda^2 \]

provided $c_p$ is taken to be at least the finite value $\left[\sum_{n=2}^{\infty} (n^{2p}/n!)\right]^{1/p}$. □

5.2. Bit comparisons. We now turn our attention from $K(\lambda)$ to the more interesting
random variable $B(\lambda)$, the total number of bit comparisons. We discuss first
asymptotics for the mean $\mu_{\mathrm{unif}}(\lambda)$ and then the variability of $B(\lambda)$ about the mean.
In our next proposition we will derive the asymptotic estimate (1.5) by applying
standard asymptotic techniques to the exact formula (1.4).

Proposition 5.4. Asymptotically as $\lambda \to \infty$, we have

\[ \mu_{\mathrm{unif}}(\lambda) = \mathbf{E}\,B(\lambda) = \lambda(\ln \lambda)(\lg \lambda) - c_1 \lambda \ln \lambda + c_2 \lambda + \pi_\lambda \lambda + O(\log \lambda). \]

Proof (outline). Recalling (1.4) and noting that for $x > 0$ we have

\[ g(x) := \sum_{k=2}^{\infty} (-1)^k \frac{x^k}{k!\,(k-1)k} = \int_0^x (x - y)(e^{-y} - 1 + y)\, y^{-2}\, dy, \]

it follows that $\mu(\lambda) \equiv \mu_{\mathrm{unif}}(\lambda)$ has the harmonic sum form

\[ \mu(\lambda) = 2 \sum_{j=0}^{\infty} 2^j g(2^{-j} \lambda), \]

rendering it amenable to treatment by Mellin transforms, see, e.g., [10] or [11].
Indeed, it follows immediately that the Mellin transform $\mu^*$ of $\mu$ is given for s in
the fundamental strip $\{s \in \mathbb{C} : -2 < \operatorname{Re} s < -1\}$ by

\[ \mu^*(s) = 2 g^*(s) \Lambda(s) \]

in terms of the Mellin transform $g^*$ of g and the generalized Dirichlet series

\[ \Lambda(s) = \sum_{j=0}^{\infty} 2^{j(s+1)} = \frac{1}{1 - 2^{s+1}}. \]

But it's also easy to check using the integral formula for g that

\[ g^*(s) = \frac{\Gamma(s)}{(s + 1)s}, \]

and so

\[ \mu^*(s) = \frac{2\Gamma(s)}{(s + 1)s(1 - 2^{s+1})}. \]

The desired asymptotic expansion for $\mu(\lambda)$ (including the remainder term) can then
be read off from the singular behavior of $\mu^*(s)$ at its poles located at $s = -1$ (triple
pole), $s = -1 - i\beta k$ for $k \in \mathbb{Z} \setminus \{0\}$ (simple poles), and $s = 0$ (double pole),
paralleling the use of Rice's method for $\mathbf{E}\,B_n$ in Section 4. □

In order to move beyond the mean of $B(\lambda)$, we define

\[ I_{k,j} := [(j - 1)2^{-k},\ j 2^{-k}) \]

to be the jth dyadic rational interval of rank k, and consider

\begin{align*}
B_k(\lambda) &:= \text{number of comparisons of } (k+1)\text{st bits}, \\
B_{k,j}(\lambda) &:= \text{number of comparisons of } (k+1)\text{st bits between keys in } I_{k,j}.
\end{align*}

Observe that

\[ B(\lambda) = \sum_{k=0}^{\infty} B_k(\lambda) = \sum_{k=0}^{\infty} \sum_{j=1}^{2^k} B_{k,j}(\lambda). \tag{5.8} \]

A simplification provided by our Poissonization is that, for each fixed k, the variables
$B_{k,j}(\lambda)$ are independent. Further, the marginal distribution of $B_{k,j}(\lambda)$ is
simply that of $K(2^{-k} \lambda)$.

Remark 5.5. Taking expectations in (5.8), we find

\[ \mu_{\mathrm{unif}}(\lambda) = \mathbf{E}\,B(\lambda) = \sum_{k=0}^{\infty} 2^k\, \mathbf{E}\,K(2^{-k} \lambda). \tag{5.9} \]

If one is satisfied with a remainder of $O(\lambda)$ rather than $O(\log \lambda)$, then Proposition 5.4
can also be proved by means of (5.9). This is done by splitting the sum
$\sum_{k=0}^{\infty}$ there into $\sum_{k=0}^{\lfloor \lg \lambda \rfloor}$ and $\sum_{k=\lfloor \lg \lambda \rfloor + 1}^{\infty}$ and utilizing (5.1) (to the needed order)
for the first sum and (5.2) [or rather the simpler $\mathbf{E}\,K(\lambda) = O(\lambda^2)$ as $\lambda \to 0$] for the
second. We omit the details. (See also Section 6, where this argument is used in a
more general situation as part of the proof of Theorem 1.3.)
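
The identity (5.9) can be checked numerically against the exact Poissonized formula (1.4). The sketch below uses the power series $\mathbf{E}\,K(x) = 2 \sum_{k \ge 2} (-1)^k x^k / (k!\,(k-1)k)$, obtained by Poissonizing the alternating-sum formula of Remark 4.1; the truncation levels and function names are illustrative assumptions, and plain floating point is adequate only for moderate $\lambda$.

```python
from math import factorial, fsum


def mean_keys(x, terms=80):
    """E K(x) via its power series (suitable for moderate x only)."""
    return 2 * fsum(
        (-1) ** k * x ** k / (factorial(k) * (k - 1) * k)
        for k in range(2, terms)
    )


def mean_bits_direct(lam, terms=80):
    """E B(lambda) from the exact formula (1.4)."""
    return 2 * fsum(
        (-1) ** k * lam ** k
        / (factorial(k) * (k - 1) * k * (1 - 2.0 ** -(k - 1)))
        for k in range(2, terms)
    )


def mean_bits_harmonic(lam, levels=80):
    """E B(lambda) from the harmonic sum (5.9), truncated at `levels`."""
    return fsum(2 ** k * mean_keys(lam / 2 ** k) for k in range(levels))
```

For instance, at $\lambda = 4$ the two evaluations agree to within floating-point accuracy. The k-th summand of (5.9) behaves like $\lambda^2 2^{-k-1}$ for large k, so truncating the sum after a few dozen levels loses nothing visible.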

Moreover, we are now in position to establish the concentration of $B(\lambda)$ about
$\mu_{\mathrm{unif}}(\lambda)$ promised just prior to (1.6).

Proposition 5.6. There exists a constant c such that $\operatorname{Var} B(\lambda) \le c^2 \lambda^2$ for
$0 < \lambda < \infty$.

Proof. For $0 < \lambda < \infty$, we have by (5.8), the triangle inequality for $\|\cdot\|_2$, independence
and $B_{k,j}(\lambda) \stackrel{\mathcal{L}}{=} K(2^{-k} \lambda)$, and Lemma 5.3, with $c := c_2 \sum_{k=0}^{\infty} 2^{-k/2}$,

\[ [\operatorname{Var} B(\lambda)]^{1/2} \le \sum_{k=0}^{\infty} [\operatorname{Var} B_k(\lambda)]^{1/2} = \sum_{k=0}^{\infty} [2^k \operatorname{Var} K(2^{-k} \lambda)]^{1/2} \le c\lambda. \qquad \Box \]

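Our next proposition extends the previous one but is limited to $\lambda \ge 1$.

Proposition 5.7. For any real $1 \le p < \infty$, there exists a constant $c'_p < \infty$ such
that

\[ \|B(\lambda) - \mathbf{E}\,B(\lambda)\|_p \le c'_p \lambda \quad \text{for } \lambda \ge 1. \]

Proof. Because $L^p$-norm is nondecreasing in p, we may assume that $p \ge 2$. The
proof again starts with use of the triangle inequality for $\|\cdot\|_p$: For $0 < \lambda < \infty$ we
have from (5.8) that

\[ \|B(\lambda) - \mathbf{E}\,B(\lambda)\|_p \le \sum_{k=0}^{\infty} \|B_k(\lambda) - \mathbf{E}\,B_k(\lambda)\|_p. \tag{5.10} \]

Further,

\[ B_k(\lambda) - \mathbf{E}\,B_k(\lambda) = \sum_{j=1}^{2^k} [B_{k,j}(\lambda) - \mathbf{E}\,B_{k,j}(\lambda)], \]

where the summands are independent and centered, each with the same distribution
as $K(2^{-k} \lambda) - \mathbf{E}\,K(2^{-k} \lambda)$. Hence, by Rosenthal's inequality [Theorem 3] (see
also, e.g., [12, Theorem 3.9.1]) and Lemma 5.3,

\begin{align*}
\|B_k(\lambda) - \mathbf{E}\,B_k(\lambda)\|_p
&\le b_1 \left( 2^{k/p} \|B_{k,j}(\lambda) - \mathbf{E}\,B_{k,j}(\lambda)\|_p + 2^{k/2} \|B_{k,j}(\lambda) - \mathbf{E}\,B_{k,j}(\lambda)\|_2 \right) \\
&= b_1 2^{k/p} \|K(2^{-k} \lambda) - \mathbf{E}\,K(2^{-k} \lambda)\|_p + b_1 2^{k/2} \|K(2^{-k} \lambda) - \mathbf{E}\,K(2^{-k} \lambda)\|_2 \\
&\le b_1 2^{k/p} c_p \left( 2(2^{-k} \lambda)^{2/p} + 2^{-k} \lambda \right) + b_1 2^{k/2} c_2 2^{-k} \lambda \\
&\le b_2 2^{-k/p} \lambda^{2/p} + b_3 2^{-k/2} \lambda
\end{align*}

for some constants $b_1$, $b_2$ and $b_3$ (depending on p). Therefore, by (5.10),

\[ \|B(\lambda) - \mathbf{E}\,B(\lambda)\|_p \le b'_2 \lambda^{2/p} + b'_3 \lambda \le (b'_2 + b'_3)\lambda \]

when $\lambda \ge 1$. □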

Remark 5.8. For the (rather uninteresting) case $\lambda \le 1$, the same proof yields
$\|B(\lambda) - \mathbf{E}\,B(\lambda)\|_p \le c'_p \lambda^{2/p}$ for $p \ge 2$. This inequality actually holds (for some $c'_p$)
for all $p \ge 1$; the case $1 \le p < 2$ follows easily from (5.8) and Lemma 5.3.



Remark 5.9. In [4] it is shown (in a more general setting) that the variables $B_k(\lambda)$
are positively correlated, from which it is easy to check that $\operatorname{Var} B(\lambda) = \Omega(\lambda^2)$ for
$\lambda \ge 1$. We then have $\|B(\lambda) - \mathbf{E}\,B(\lambda)\|_p = \Theta(\lambda)$ for each real $2 \le p < \infty$. In fact,
it is even true that $[B(\lambda) - \mathbf{E}\,B(\lambda)]/\lambda$ has a nondegenerate limiting distribution:
see [4].

6. Mean number of bit comparisons for keys drawn from an arbitrary density f

In this section we outline martingale arguments for proving Theorem 1.3 for the
expected number of bit comparisons for Poisson($\lambda$) draws from a rather general
density f. (For background on martingales, see any standard measure-theoretic
probability text, e.g., [2].) In addition to the notation above, we will use the
following:

\begin{align*}
p_{k,j} &:= \int_{I_{k,j}} f, \\
f_{k,j} &:= (\text{average value of } f \text{ over } I_{k,j}) = 2^k p_{k,j}, \\
f_k(x) &:= f_{k,j} \quad \text{for all } x \in I_{k,j}, \\
f^*(\cdot) &:= \sup_k f_k(\cdot).
\end{align*}

Note for each $k \ge 0$ that $\sum_j p_{k,j} = 1$ and that $f_k : (0, 1) \to [0, \infty)$ is the smoothing
of f to the rank-k dyadic rational intervals. From basic martingale theory we have
immediately the following simple but key observation.

Lemma 6.1. With $f_\infty := f$,

\[ (f_k)_{0 \le k \le \infty} \text{ is a Doob's martingale,} \]

and $f_k \to f$ almost surely (and in $L^1$).

Our proof of Theorem 1.3 will also utilize the following technical lemma.

Lemma 6.2. If (as assumed in Theorem 1.3) the probability density f on (0, 1)
satisfies $\int f (\ln_+ f)^4 < \infty$, then

\[ \int_0^1 f^* (\ln_+ f^*)^3 < \infty. \tag{6.1} \]

Proof. This follows readily by applying one of the standard maximal inequalities
for nonnegative submartingales which asserts that for a nonnegative submartingale
$(Y_k)_{1 \le k \le \infty}$ and $Y^* := \sup_{1 \le k \le \infty} Y_k$ we have

\[ \mathbf{E}\,Y^* \le \frac{e}{e - 1} \left[ 1 + \sup_{1 \le k \le \infty} \mathbf{E}(Y_k \ln_+ Y_k) \right]; \tag{6.2} \]

see, e.g., [12, Theorem 10.9.4]. The process $(Y_k := f_k (\ln_+ f_k)^3)_{1 \le k \le \infty}$ is a
submartingale by Lemma 6.1 and the convexity of the function $x \mapsto x(\ln_+ x)^3$, and for
every $1 \le k \le \infty$ we have

\[ \int_0^1 Y_k \ln_+ Y_k \le 4 \int_0^1 f_k (\ln_+ f_k)^4 \le 4 \int_0^1 f (\ln_+ f)^4 < \infty, \]

so (6.2) does indeed give the desired conclusion. □

Before we begin the proof of Theorem 1.3 we remark that the asymptotic inequality
$\mu_f(\lambda) \ge \mu_{\mathrm{unif}}(\lambda)$ observed there in fact holds for every $0 < \lambda < \infty$.
Indeed,

\[ \mu_f(\lambda) = \sum_{k=0}^{\infty} \sum_{j=1}^{2^k} \mathbf{E}\,K(\lambda p_{k,j}) \ge \sum_{k=0}^{\infty} 2^k\, \mathbf{E}\,K(\lambda 2^{-k}) = \mu_{\mathrm{unif}}(\lambda), \tag{6.3} \]

where the first equality appropriately generalizes (5.9), the inequality follows by the
convexity of $\mathbf{E}\,K(\lambda)$ (recall Lemma 5.1), and the second equality follows by (5.9).
Furthermore, strict inequality $\mu_f(\lambda) > \mu_{\mathrm{unif}}(\lambda)$ holds unless $p_{k,j} = 2^{-k}$ for all k
and j, i.e., unless the distribution F is uniform. (This argument is valid also if F
does not have a density.)

Proof of Theorem ] Assume A > 1 and, with m = m(A) := [IgA], split the 
double sum in (16.31) as 



m 2" 

(6.4) ^(X) = ^2^EK(Xp kJ )+R(X), 

fc=0 j=l 

with i?(A) a remainder term. Our first aim is to show that 

oo 2 k 
k=m-\-l j—l 

Since E-ftT(-) is nondecreasing, we have the inequality 

oo 

Eff(Apfcj) < ^ E K(2 n+1 ) l[2 n < Ap fc)i < 2" +1 ] 

n=— oo 
oo 

< 2-"E^(2 n+1 )Ap M l[Ap fej >2"]. 

n=— oo 

Now if Apk j > 2 n , then for x G Ik,j we have 

f(z) > fk(x) = 2 k p kJ > 2 k X~ 1 2 n > 2 k ~ m + n , 

Hence 

oo 

EK{X Pktj ) < J2 2- n -EK(2 n+1 )X Pk , J l[X Pk . J > 2 n ] 

n— — oo 



oo „ 

<A 2- n EK{2 n+1 ) / f k (x) l[2 k - m+n < f*{x)]dx 

n=-oo 



and therefore 

2' 



2 oo 

^EX(Ap M )<A ^ 2-"E/^(2" +1 ) / f k {x) l[2 k - m+n <f*(x)]dx 

7 = 1 n=-oo J ° 

„1 oo 

<A/ /*(ar) Y 2-"Eif(2" +1 )l[2^" l+ " < f*{x)]dx. 



THE NUMBER OF BIT COMPARISONS USED BY QUICKSORT 



15 



From this we conclude

\[
R(\lambda) \le \lambda \int_0^1 f^*(x) \sum_{n=-\infty}^{\infty} 2^{-n}\, \mathbf{E}\,K(2^{n+1}) \sum_{k=1}^{\infty} \mathbf{1}[2^{k+n} \le f^*(x)]\,dx
= \lambda \int_0^1 f^*(x) \sum_{k=1}^{\infty} \sum_{n=-\infty}^{\nu(x,k)} 2^{-n}\, \mathbf{E}\,K(2^{n+1})\,dx,
\]

with $\nu(x,k) := \lfloor \lg f^*(x) \rfloor - k$. We proceed to bound the sum on $n$ here. If $\nu \le 0$,
then using the bound of (constant times $\lambda^2$) on $\mathbf{E}\,K(\lambda)$ from Lemma 5.1 we can
bound the sum $\sum_{n \le \nu} 2^{-n}\, \mathbf{E}\,K(2^{n+1})$ by a constant (say, $b'$) times $2^{\nu}$, while if $\nu > 0$
we can again use the estimates from Lemma 5.1 to bound, for some constants
$b_1, b_2, b''$, the same sum by

\[
b_1 + \sum_{n=1}^{\nu} 2^{-n}\, b_2 (n+1) 2^{n+1} \le b'' \nu^2.
\]

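Therefore, for another constant $b$ we have

\[
\sum_{k=1}^{\infty} \sum_{n=-\infty}^{\nu(x,k)} 2^{-n}\, \mathbf{E}\,K(2^{n+1})
\le \sum_{k=1}^{\lfloor \lg f^*(x) \rfloor - 1} b''\, \nu(x,k)^2 + \sum_{k=\lfloor \lg f^*(x) \rfloor}^{\infty} b'\, 2^{\nu(x,k)}
\le b''\, [\lg_+ f^*(x)]^3 + 2b' \le b\,\bigl(1 + [\ln_+ f^*(x)]^3\bigr).
\]

Using Lemma 6.2 we finally conclude

\[
R(\lambda) \le b \lambda \int_0^1 f^*\,[1 + (\ln_+ f^*)^3] = O(\lambda).
\]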

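Plugging $R(\lambda) = O(\lambda)$ and the consequence

\[
\mathbf{E}\,K(x) = 2x \ln x - (4 - 2\gamma)x + O(x^{1/2}),
\]

which holds uniformly in $0 \le x < \infty$, of Lemma 5.1 into (6.4), we find

\begin{align*}
\mu_f(\lambda)
&= \sum_{k=0}^{m} \sum_{j=1}^{2^k} \Bigl[ 2\lambda p_{k,j} (\ln \lambda + \ln p_{k,j}) - (4 - 2\gamma)\lambda p_{k,j} + O\bigl((\lambda p_{k,j})^{1/2}\bigr) \Bigr] + O(\lambda) \\
&= \sum_{k=0}^{m} \Bigl[ 2\lambda \ln \lambda + 2\lambda \sum_{j=1}^{2^k} p_{k,j} \ln p_{k,j} - (4 - 2\gamma)\lambda + O\bigl(\lambda^{1/2} 2^{k/2}\bigr) \Bigr] + O(\lambda) \\
&= \mu_{\mathrm{unif}}(\lambda) + 2\lambda \sum_{k=0}^{m} \int f_k \ln f_k + O(\lambda),
\end{align*}

where we have used the Cauchy-Schwarz inequality at the second equality and
comparison with the uniform case ($f \equiv 1$) at the third.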

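But, by Lemma 6.1, (6.1), and the dominated convergence theorem,

\[
\int f_k \ln f_k \to \int f \ln f \quad \text{as } k \to \infty,
\tag{6.5}
\]

from which follows

\begin{align*}
\mu_f(\lambda) &= \mu_{\mathrm{unif}}(\lambda) + 2\lambda (\lg \lambda) \int f \ln f + o(\lambda \log \lambda) \\
&= \mu_{\mathrm{unif}}(\lambda) + 2\lambda (\ln \lambda) \int f \lg f + o(\lambda \log \lambda),
\end{align*}

as desired. □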
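The convergence (6.5) can be illustrated numerically. The following sketch is an added example, not part of the proof: it assumes the particular density $f(x) = 2x$ on $[0, 1]$ (chosen here only because its dyadic-interval masses have a closed form) and uses the relation $f_k = 2^k p_{k,j}$ on the $j$th dyadic interval of rank $k$. The function name `entropy_term` is hypothetical.

```python
import math

def entropy_term(k):
    """Return the integral of f_k ln f_k for the density f(x) = 2x on [0, 1]."""
    total = 0.0
    for j in range(1, 2 ** k + 1):
        # Mass of the j-th dyadic interval of rank k under f(x) = 2x:
        # (j / 2^k)^2 - ((j - 1) / 2^k)^2 = (2j - 1) / 4^k.
        p = (2 * j - 1) / 4 ** k
        # f_k equals 2^k * p on that interval, so it contributes p * ln(2^k * p).
        total += p * math.log(2 ** k * p)
    return total

# Closed form of the integral of 2x ln(2x) over [0, 1].
limit = math.log(2) - 0.5

values = [entropy_term(k) for k in range(13)]
for k, v in enumerate(values):
    print(k, v, limit - v)
```

For this density the printed gap `limit - v` shrinks as $k$ grows, which is the behavior (6.5) asserts in general.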



Remark 6.3. If we make the stronger assumption that

\[
f \text{ is H\"older}(\alpha) \text{ continuous on } [0, 1] \text{ for some } \alpha > 0,
\]

then we can quantify (6.5) and improve the $o(\lambda \log \lambda)$ remainder in the statement
of Theorem 1.3 to $O(\lambda)$. A proof is provided in Appendix B.



7. An improvement: BitsQuick

Recall the operation of Quicksort described in Section 2. Suppose that the
pivot [call it $x = 0.x(1)\,x(2)\ldots$] has its first $m_1$ bits $x(1), x(2), \ldots, x(m_1)$ all equal
to 0. Then the subarray of keys smaller than $x$ all have length-$m_1$ prefix consisting
of all 0s as well, and it wastes time to compare these known bits when Quicksort
is called recursively on this subarray.

We call BitsQuick the obvious recursive algorithm that does away with this
waste. We give one possible implementation in the boxed pseudocode, which
calls for some explanation. The initial call to the routine BitsQuick($A$, $m$) is to
BitsQuick($A_0$, 0), where $A_0$ is the full array to be sorted; in general, the routine
BitsQuick($A$, $m$) in essence sorts a subarray $A$ of $A_0$ in which every element has
(and is known to have) the same prefix of length $m$.

There, for $m_1 = 0, 1, \ldots$, we use the notation $L^{m_1}(y)$ for the result of rotating
to the left $m_1$ bits the register containing key $y$, i.e., replacing $y = .y(1)\,y(2)\ldots$
by $.y(m_1 + 1)\,y(m_1 + 2)\ldots$. The input $m$ indicates how many bits each element
of the array $A$ needs to be rotated to the right before the routine terminates,
and $R^m(A)$ (in the last line of the pseudocode) is the resulting array after these
right-rotations. The symbol $\|$ denotes concatenation (of sorted arrays). (We omit
minor implementational details, such as how to do sorting in place and to maintain
random ordering for the generated subarrays, that are the same as for Quicksort
and very well known.) The routine BitsQuick($A$, $m$) returns the sorted version
of $A$.

A related but somewhat more complicated algorithm has been considered by
Roura [22, Section 5].

The following theorem is the analogue for BitsQuick of Theorem 1.1.

Theorem 7.1. If the keys $X_1, \ldots, X_n$ are independent and uniformly distributed
on (0, 1), then the number $Q_n$ of bit comparisons required to sort these keys using
BitsQuick has expectation given by the following exact and asymptotic expressions:

\[
\mathbf{E}\,Q_n = \sum_{k=2}^{n} (-1)^k \binom{n}{k} \frac{1}{k} \left[ \frac{2(k-2)}{1 - 2^{-k}} - \frac{k-4}{1 - 2^{-(k-1)}} \right] + 2nH_n - 5n + 2H_n + 1
\]

where, with $\beta := 2\pi/\ln 2$ as before,



7 



15 



( 



hT2 + 2 > = 139 




2 



and 



1 



E 



T(-l-i[3k)n if3k 



It 



In 2 



l + 0k 



fc£Z: k^O 



is periodic in lg n with period 1 and amplitude smaller than 2 x 10 '.
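As an illustration only, the routine BitsQuick($A$, $m$) described above can be sketched in Python. The sketch makes several assumptions that are not in the text: keys are stored as integers in a fixed-width register of `W` bits (the constant `W` and the helper names `rotl`, `rotr`, `bit`, and `bits_quick` are hypothetical), the pivot's run of equal leading bits is capped at `W`, and subarrays of size at most 1 are also rotated back before being returned. Python's built-in integer comparison stands in for the bitwise key comparison, so the sketch shows the data movement rather than the bit-comparison count.

```python
import random

W = 16                 # register width in bits (an assumption of this sketch)
MASK = (1 << W) - 1

def rotl(y, m):
    """Rotate the W-bit register y left by m bits (the operation L^m)."""
    m %= W
    return ((y << m) | (y >> (W - m))) & MASK

def rotr(y, m):
    """Rotate the W-bit register y right by m bits (the operation R^m)."""
    m %= W
    return ((y >> m) | (y << (W - m))) & MASK

def bit(x, i):
    """Return x(i), the i-th bit of x counted from the most significant, i = 1..W."""
    return (x >> (W - i)) & 1

def bits_quick(a, m=0, rng=random):
    """Sort the registers in a, then rotate every key right by m bits."""
    if len(a) <= 1:
        return [rotr(y, m) for y in a]
    p = rng.randrange(len(a))          # position of the random pivot
    x = a[p]
    rest = a[:p] + a[p + 1:]
    lead = bit(x, 1)
    m1 = 1
    # Length of the pivot's run of leading bits equal to x(1), capped at W.
    while m1 < W and bit(x, m1 + 1) == lead:
        m1 += 1
    smaller = [y for y in rest if y < x]
    larger = [y for y in rest if y >= x]
    if lead == 0:
        # Keys below the pivot share its m1 leading zeros: rotate them away.
        smaller = bits_quick([rotl(y, m1) for y in smaller], m1, rng)
        larger = bits_quick(larger, 0, rng)
    else:
        # Keys at or above the pivot share its m1 leading ones.
        smaller = bits_quick(smaller, 0, rng)
        larger = bits_quick([rotl(y, m1) for y in larger], m1, rng)
    return [rotr(y, m) for y in smaller + [x] + larger]

keys = [random.randrange(1 << W) for _ in range(10)]
print(bits_quick(keys) == sorted(keys))
```

Rotating the shared prefix to the low-order end leaves the relative order of the keys in a subarray unchanged, which is why the recursive calls can work directly on the rotated registers.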

Proof. We establish only the exact expression; the asymptotic expression can be 
derived from it using Rice's method, just as we outlined for E B n in Section |U 
Further, in light of the exact expression (1.1) for $\mathbf{E}\,B_n$, we need only show that the
expected savings $\mathbf{E}\,B_n - \mathbf{E}\,Q_n$ enjoyed by BitsQuick relative to Quicksort is given



The routine BitsQuick(A, m)

If |A| ≤ 1
    Return R^m(A)
Else
    Set A_- ← ∅ and A_+ ← ∅
    Choose a random pivot key x = 0.x(1) x(2) ... from A
    If x(1) = 0
        Set m_1 ← 1
        While x(m_1 + 1) = 0
            Set m_1 ← m_1 + 1
        For y ∈ A with y ≠ x
            If y < x
                Set y ← L^{m_1}(y) and then A_- ← A_- ∪ {y}
            Else
                Set A_+ ← A_+ ∪ {y}
        Set A_- ← BitsQuick(A_-, m_1) and A_+ ← BitsQuick(A_+, 0)
        Set A ← A_- || {x} || A_+
    Else
        Set m_1 ← 1
        While x(m_1 + 1) = 1
            Set m_1 ← m_1 + 1
        For y ∈ A with y ≠ x
            If y < x
                Set A_- ← A_- ∪ {y}
            Else
                Set y ← L^{m_1}(y) and then A_+ ← A_+ ∪ {y}
        Set A_- ← BitsQuick(A_-, 0) and A_+ ← BitsQuick(A_+, m_1)
        Set A ← A_- || {x} || A_+
    Return R^m(A)

by the expression



\[
\mathbf{E}\,B_n - \mathbf{E}\,Q_n = \sum_{k=2}^{n} (-1)^k \binom{n}{k} \frac{1}{k} \left[ \frac{(k-3)(k-2)}{(k-1)\,[1 - 2^{-(k-1)}]} - \frac{2(k-2)}{1 - 2^{-k}} \right] - (2nH_n - 5n + 2H_n + 1).
\tag{7.1}
\]

We use the order-statistics notation $X_{(1)}, \ldots, X_{(n)}$ from Section 3. To derive (7.1), we will compute the (random) total savings for all comparisons with
$X_{(i)}$ as pivot, sum over $i = 1, \ldots, n$, and take the expectation. For convenience,
we may assume that the algorithm chooses a pivot also in the case of a (sub)array
with exactly 1 element, although it is not compared to anything; thus every key
becomes a pivot. Observe that $X_{(i)}$ is compared as pivot with keys $X_{(L)}, \ldots, X_{(R)}$
(except itself) and with no others, where $L = L(i)$ and $R = R(i)$ with $L \le i \le R$
are the (random) values uniquely determined by the condition that $X_{(i)}$ is the first
pivot chosen from among $X_{(L)}, \ldots, X_{(R)}$ but not (if $L \ne 1$) the first from among
$X_{(L-1)}, \ldots, X_{(R)}$ nor (if $R \ne n$) the first from among $X_{(L)}, \ldots, X_{(R+1)}$. Hence,
$X_{(i)}$ is compared as a pivot with $R - L$ other keys. The comparisons with $X_{(i)}$ as
pivot are performed with the knowledge that all the keys $X_{(L)}, \ldots, X_{(R)}$ have values
in the interval $(X_{(L-1)}, X_{(R+1)})$, where if $L = 1$ we interpret $X_{(0)}$ as $0 = .000\ldots$
and if $R = n$ we interpret $X_{(n+1)}$ as $1 = .111\ldots$. The total savings gained by this
knowledge is $\sum_{j \in [L,R]:\, j \ne i} [b(X_{(L-1)}, X_{(R+1)}) - 1] = (R - L)\,[b(X_{(L-1)}, X_{(R+1)}) - 1]$,
where we recall that $b(x, y)$ denotes the index of the first bit at which $x$ and $y$ differ.
Therefore the grand total savings is

\begin{align*}
B_n - Q_n &= \sum_{i=1}^{n} [R(i) - L(i)]\,[b(X_{(L(i)-1)}, X_{(R(i)+1)}) - 1] \\
&= \sum_{(l,r):\, 1 \le l \le r \le n} (r - l)\,[b(X_{(l-1)}, X_{(r+1)}) - 1]\;|\{i : (L(i), R(i)) = (l, r)\}|,
\end{align*}

and so by independence we have

\[
\mathbf{E}\,B_n - \mathbf{E}\,Q_n = \sum_{(l,r):\, 1 \le l \le r \le n} (r - l)\,[\mathbf{E}\,b(X_{(l-1)}, X_{(r+1)}) - 1]\;\mathbf{E}\,|\{i : (L(i), R(i)) = (l, r)\}|.
\]

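The second expectation on the right is easily computed:

\[
\mathbf{E}\,|\{i : (L(i), R(i)) = (l, r)\}| = \sum_{i=l}^{r} \mathbf{P}[(L(i), R(i)) = (l, r)] = (r - l + 1)\,\theta(l, r)
\]

where, abbreviating $r - l$ to $d$ and writing "xor" for "exclusive or",

\[
\theta(l, r) = \begin{cases}
(d+1)^{-1} - 2(d+2)^{-1} + (d+3)^{-1} & \text{if } l \ne 1 \text{ and } r \ne n \\
(d+1)^{-1} - (d+2)^{-1} & \text{if } l = 1 \text{ xor } r = n \\
(d+1)^{-1} & \text{if } l = 1 \text{ and } r = n,
\end{cases}
\]

so that

\[
\mathbf{E}\,|\{i : (L(i), R(i)) = (l, r)\}| = \begin{cases}
2\,[(d+2)(d+3)]^{-1} & \text{if } l \ne 1 \text{ and } r \ne n \\
(d+2)^{-1} & \text{if } l = 1 \text{ xor } r = n \\
1 & \text{if } l = 1 \text{ and } r = n.
\end{cases}
\]

Therefore

\begin{align*}
\mathbf{E}\,B_n - \mathbf{E}\,Q_n
&= 2 \sum_{(l,r):\, 2 \le l \le r \le n-1} \frac{r - l}{(r - l + 2)(r - l + 3)}\,[\mathbf{E}\,b(X_{(l-1)}, X_{(r+1)}) - 1] \\
&\quad + \sum_{r=1}^{n-1} \frac{r - 1}{r + 1}\,[\mathbf{E}\,b(0, X_{(r+1)}) - 1] + \sum_{l=2}^{n} \frac{n - l}{n - l + 2}\,[\mathbf{E}\,b(X_{(l-1)}, 1) - 1] \\
&\quad + (n - 1)\,[\mathbf{E}\,b(0, 1) - 1] \\
&= 2 \sum_{(l,r):\, 2 \le l \le r \le n-1} \frac{r - l}{(r - l + 2)(r - l + 3)}\,[\mathbf{E}\,b(X_{(l-1)}, X_{(r+1)}) - 1] \\
&\quad + 2 \sum_{r=1}^{n-1} \frac{r - 1}{r + 1}\,[\mathbf{E}\,b(0, X_{(r+1)}) - 1] \\
&= 2 \sum_{i=1}^{n} \sum_{j=i+2}^{n} \frac{j - i - 2}{(j - i)(j - i + 1)}\,\mathbf{E}\,b(X_{(i)}, X_{(j)}) + 2 \sum_{j=2}^{n} \frac{j - 2}{j}\,\mathbf{E}\,b(0, X_{(j)}) - q_n \\
&= 2D_n + 2E_n - q_n, \tag{7.2}
\end{align*}

where: at the second equality we have used symmetry and the observation that
$b(0, 1) = 1$; the last two sums are denoted $D_n$ and $E_n$, respectively; and

\begin{align*}
q_n &:= 2 \sum_{(l,r):\, 2 \le l \le r \le n-1} \frac{r - l}{(r - l + 2)(r - l + 3)} + 2 \sum_{r=2}^{n-1} \frac{r - 1}{r + 1} \\
&= 2nH_n - 5n + 2H_n + 1. \tag{7.3}
\end{align*}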
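Both the simplification of $(r - l + 1)\,\theta(l, r)$ and the closed form (7.3) are elementary but easy to get wrong by hand, so a small exact-arithmetic check may be useful. The following sketch is an added illustration; the function names are hypothetical, and the boolean flags `left_edge` and `right_edge` stand for the conditions $l = 1$ and $r = n$.

```python
from fractions import Fraction

def expected_count(d, left_edge, right_edge):
    """Return (d + 1) * theta(l, r), where d = r - l."""
    if left_edge and right_edge:
        theta = Fraction(1, d + 1)
    elif left_edge or right_edge:
        theta = Fraction(1, d + 1) - Fraction(1, d + 2)
    else:
        theta = Fraction(1, d + 1) - Fraction(2, d + 2) + Fraction(1, d + 3)
    return (d + 1) * theta

def harmonic(n):
    """Return the n-th harmonic number H_n as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

def q_from_sums(n):
    """Evaluate q_n directly from its two defining sums."""
    interior = sum(
        Fraction(r - l, (r - l + 2) * (r - l + 3))
        for l in range(2, n)       # l = 2, ..., n - 1
        for r in range(l, n)       # r = l, ..., n - 1
    )
    edge = sum(Fraction(r - 1, r + 1) for r in range(2, n))  # r = 2, ..., n - 1
    return 2 * interior + 2 * edge

def q_closed_form(n):
    """Evaluate the right-hand side of (7.3)."""
    h = harmonic(n)
    return 2 * n * h - 5 * n + 2 * h + 1

for n in range(1, 8):
    print(n, q_from_sums(n), q_closed_form(n))
```

Exact fractions are used instead of floating point so that the two columns can be compared for equality rather than approximate agreement.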



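The expectation $\mathbf{E}\,b(X_{(i)}, X_{(j)})$ may be computed (for $1 \le i < j \le n$) by recalling
the joint density $g_{n,i,j}$ of $(X_{(i)}, X_{(j)})$ given at (3.2). We then find

\begin{align*}
\mathbf{E}\,b(X_{(i)}, X_{(j)}) &= \sum_{\ell=0}^{\infty} \mathbf{P}[b(X_{(i)}, X_{(j)}) \ge \ell + 1] \\
&= \sum_{\ell=0}^{\infty} \sum_{m=1}^{2^\ell} \iint_{(m-1)2^{-\ell} < x < y < m2^{-\ell}} g_{n,i,j}(x, y)\,dx\,dy \\
&= \sum_{\ell=0}^{\infty} \sum_{m=1}^{2^\ell} \iint_{(m-1)2^{-\ell} < x < y < m2^{-\ell}} \binom{n}{i-1,\,1,\,j-i-1,\,1,\,n-j}\, x^{i-1} (y - x)^{j-i-1} (1 - y)^{n-j}\,dx\,dy.
\end{align*}

Now, suppressing some computational details,

\begin{align*}
&\sum_{i=1}^{n} \sum_{j=i+2}^{n} \frac{j - i - 2}{(j - i)(j - i + 1)} \binom{n}{i-1,\,1,\,j-i-1,\,1,\,n-j}\, x^{i-1} (y - x)^{j-i-1} (1 - y)^{n-j} \\
&\quad = \sum_{i=1}^{n} \sum_{j=i+2}^{n} (j - i - 2) \binom{n}{i-1,\,j-i+1,\,n-j}\, x^{i-1} (y - x)^{j-i-1} (1 - y)^{n-j} \\
&\quad = \sum_{k=3}^{n} (k - 3) \binom{n}{k} (y - x)^{k-2} \sum_{i=0}^{n-k} \binom{n-k}{i} x^{i} (1 - y)^{n-k-i} \\
&\quad = \sum_{k=3}^{n} (k - 3) \binom{n}{k} (y - x)^{k-2}\,[1 - (y - x)]^{n-k} \\
&\quad = \frac{1}{2} \sum_{k=3}^{n} (-1)^k (k - 3)(k - 2) \binom{n}{k} (y - x)^{k-2},
\end{align*}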



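and so

\begin{align*}
D_n &= \sum_{\ell=0}^{\infty} \sum_{m=1}^{2^\ell} \iint_{(m-1)2^{-\ell} < x < y < m2^{-\ell}} \frac{1}{2} \sum_{k=3}^{n} (-1)^k (k - 3)(k - 2) \binom{n}{k} (y - x)^{k-2}\,dx\,dy \\
&= \frac{1}{2} \sum_{\ell=0}^{\infty} 2^\ell \iint_{0 < x < y < 2^{-\ell}} \sum_{k=3}^{n} (-1)^k (k - 3)(k - 2) \binom{n}{k} (y - x)^{k-2}\,dx\,dy \\
&= \frac{1}{2} \sum_{k=3}^{n} (-1)^k\, \frac{(k - 3)(k - 2)}{(k - 1)k} \binom{n}{k} \sum_{\ell=0}^{\infty} 2^{-\ell(k-1)} \\
&= \frac{1}{2} \sum_{k=3}^{n} (-1)^k \binom{n}{k} \frac{(k - 3)(k - 2)}{(k - 1)k\,[1 - 2^{-(k-1)}]}. \tag{7.4}
\end{align*}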
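The last equality in the display that begins "Now, suppressing some computational details" is a binomial identity in the single variable $t = y - x$, and it can be checked with exact rational arithmetic. The sketch below is an added illustration; the names `lhs` and `rhs` are hypothetical.

```python
from fractions import Fraction
from math import comb

def lhs(n, t):
    """Sum over k of (k - 3) C(n, k) t^(k-2) (1 - t)^(n-k)."""
    return sum(
        (k - 3) * comb(n, k) * t ** (k - 2) * (1 - t) ** (n - k)
        for k in range(3, n + 1)
    )

def rhs(n, t):
    """One half of the sum over k of (-1)^k (k - 3)(k - 2) C(n, k) t^(k-2)."""
    return Fraction(1, 2) * sum(
        (-1) ** k * (k - 3) * (k - 2) * comb(n, k) * t ** (k - 2)
        for k in range(3, n + 1)
    )

t = Fraction(1, 3)
for n in range(3, 9):
    print(n, lhs(n, t), rhs(n, t))
```

Since both sides are polynomials in $t$ of degree at most $n - 2$, agreement at more than $n - 2$ rational points for a given $n$ confirms the identity for that $n$.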

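Similarly (and somewhat more easily), one sees (for $1 \le j \le n$) that

\[
\mathbf{E}\,b(0, X_{(j)}) = \sum_{\ell=0}^{\infty} \mathbf{P}[b(0, X_{(j)}) \ge \ell + 1]
= \sum_{\ell=0}^{\infty} \int_0^{2^{-\ell}} j \binom{n}{j} y^{j-1} (1 - y)^{n-j}\,dy
\]

and that

\[
\sum_{j=2}^{n} (j - 2) \binom{n}{j} y^{j-1} (1 - y)^{n-j} = \sum_{k=2}^{n} (-1)^{k-1} (k - 2) \binom{n}{k} y^{k-1},
\]

whence

\begin{align*}
E_n &= \sum_{\ell=0}^{\infty} \int_0^{2^{-\ell}} \sum_{k=2}^{n} (-1)^{k-1} (k - 2) \binom{n}{k} y^{k-1}\,dy \\
&= \sum_{k=2}^{n} (-1)^{k-1}\, \frac{k - 2}{k} \binom{n}{k} \sum_{\ell=0}^{\infty} 2^{-\ell k} \\
&= \sum_{k=2}^{n} (-1)^{k-1} \binom{n}{k} \frac{k - 2}{k\,[1 - 2^{-k}]}. \tag{7.5}
\end{align*}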



Plugging (7.3)–(7.5) into (7.2), we obtain (7.1), thus completing the proof. □

Appendix A. Some calculus



The following calculus lemmas establish the calculations (5.3)–(5.6) used in the
proof of Lemma 5.1.

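Lemma A.1. Define

\begin{align*}
\gamma_0(z) &:= \int_0^\infty e^{-y} y^z\,dy, & \mathrm{Re}\,z &> -1; \\
\gamma_1(z) &:= \int_0^\infty \bigl(e^{-y} - \mathbf{1}[y < 1]\bigr)\, y^z\,dy, & \mathrm{Re}\,z &> -2; \\
\gamma_2(z) &:= \int_0^\infty \bigl(e^{-y} - 1 + y\,\mathbf{1}[y < 1]\bigr)\, y^z\,dy, & -3 < \mathrm{Re}\,z &< -1.
\end{align*}

Then the following identities hold for $z \ne -1$:

\begin{align*}
\gamma_0(z) &= \Gamma(z + 1), \\
\gamma_1(z) &= (z + 1)^{-1}\,[\gamma_0(z + 1) - 1] = (z + 1)^{-1}\,[\Gamma(z + 2) - 1], \\
\gamma_2(z) &= (z + 1)^{-1}\,[1 + \gamma_1(z + 1)],
\end{align*}

and so $\gamma_1(-1) = \Gamma'(1) = -\gamma$ and $\gamma_2(-2) = -[1 + \gamma_1(-1)] = -(1 - \gamma)$.

Proof. The identity for $\gamma_0$ is the definition of the function $\Gamma$, and the identities
for $\gamma_1$ and $\gamma_2$ follow by integration by parts. Since $\gamma_1(z)$ is continuous in $z$ for
$\mathrm{Re}\,z > -2$, it follows from the identity for $\gamma_1(z)$ by passage to the limit that
$\gamma_1(-1) = \Gamma'(1) = -\gamma$. Finally, we obtain the desired value of $\gamma_2(-2)$ simply by
plugging $z = -2$ into the identity for $\gamma_2(z)$. □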
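The value $\gamma_1(-1) = -\gamma$ can be checked numerically. The following sketch is an added illustration using only the Python standard library: it splits the defining integral at $y = 1$, truncates the tail at $y = 50$ (an assumption that costs less than $e^{-50}$), and applies a hand-written composite Simpson rule. The helper names `simpson`, `head`, and `tail` are hypothetical.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def head(y):
    # Integrand (e^{-y} - 1) / y on (0, 1); its limit as y -> 0 is -1.
    return -1.0 if y == 0.0 else math.expm1(-y) / y

def tail(y):
    # Integrand e^{-y} / y on [1, infinity).
    return math.exp(-y) / y

gamma_1_at_minus_1 = simpson(head, 0.0, 1.0, 2000) + simpson(tail, 1.0, 50.0, 200000)
euler_gamma = 0.5772156649015329  # Euler's constant

print(gamma_1_at_minus_1, -euler_gamma)
```

The value $\gamma_2(-2) = -(1 - \gamma)$ then follows from the third identity of the lemma without further integration.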

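Let $s^{\underline{k}}$ denote the falling factorial power $s(s - 1) \cdots (s - k + 1)$.

Lemma A.2. For any fixed $s \in \mathbb{C}$ and $M = 0, 1, \ldots$ and all $\lambda \ge 1$,

\[
\int_\lambda^\infty e^{-y} y^s\,dy = e^{-\lambda} \lambda^s \left[ \sum_{k=0}^{M-1} s^{\underline{k}}\, \lambda^{-k} + O(\lambda^{-M}) \right].
\]

(The implicit constant depends on $s$ and $M$, but not on $\lambda$.)

Proof. For $\lambda > 0$, let $I(\lambda; s) := \int_\lambda^\infty e^{-y} y^s\,dy$. If $\mathrm{Re}\,s \le 0$, then

\[
|I(\lambda; s)| \le \int_\lambda^\infty e^{-y} y^{\mathrm{Re}\,s}\,dy \le \int_\lambda^\infty e^{-y} \lambda^{\mathrm{Re}\,s}\,dy = \lambda^{\mathrm{Re}\,s} e^{-\lambda},
\]

which yields the result for $\mathrm{Re}\,s \le M = 0$.
Further, integration by parts yields

\[
I(\lambda; s) = e^{-\lambda} \lambda^s + s\,I(\lambda; s - 1),
\]

and the result for $\mathrm{Re}\,s \le M$ follows by induction on $M$. Finally, if $\mathrm{Re}\,s > M$, we
use the result just proven with $M$ replaced by some $M' > \mathrm{Re}\,s$. □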
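As a quick sanity check of Lemma A.2 (a worked example added for illustration), take $s = 2$. The falling factorial powers are $2^{\underline{0}} = 1$, $2^{\underline{1}} = 2$, $2^{\underline{2}} = 2$, and $2^{\underline{k}} = 0$ for $k \ge 3$, and two integrations by parts give the expansion exactly, with no error term once $M \ge 3$:

```latex
\[
\int_\lambda^\infty e^{-y} y^2\,dy
= e^{-\lambda} (\lambda^2 + 2\lambda + 2)
= e^{-\lambda} \lambda^2 \left[ 1 + 2\lambda^{-1} + 2\lambda^{-2} \right].
\]
```

For non-integer $s$ the series does not terminate, which is why the lemma is stated with a remainder $O(\lambda^{-M})$.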



Appendix B. Proof of Remark 6.3

We prove that if

\[
f \text{ is H\"older}(\alpha) \text{ continuous on } [0, 1] \text{ for some } \alpha > 0,
\tag{B.1}
\]

then, as claimed in Remark 6.3, the conclusion of Theorem 1.3 holds with the
remainder $o(\lambda \log \lambda)$ improved to $O(\lambda)$.

Proof. Using the notation $m = m(\lambda) := \lceil \lg \lambda \rceil$ of the proof of Theorem 1.3 appearing
in Section 6, it follows from that proof that we need only establish the
asymptotic estimate

\[
\sum_{k=0}^{m} \left( \int f_k \ln f_k - \int f \ln f \right) = O(1)
\]

as $\lambda \to \infty$, and for this it is clearly sufficient to show that

\[
f_k(x) \ln f_k(x) - f(x) \ln f(x) = O\bigl((k + 1)\,2^{-k\alpha}\bigr) \quad \text{uniformly in } x \in [0, 1].
\tag{B.2}
\]

But indeed (B.1) evidently implies

\[
f_k(x) - f(x) = O(2^{-k\alpha}) \quad \text{uniformly in } x \in [0, 1],
\]

and thence, routinely, (B.2). □